Multi-layer Perceptron (MLP) for Multi-class Classification

Dataset

A random n-class classification dataset can be generated using sklearn.datasets.make_classification. Here, we generate a dataset with two features and 3,000 instances for a multi-class classification problem with four classes.

In [1]:
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from num2words import num2words

n_features = 2
n_classes = 4
X, y = make_classification(n_samples=int((n_classes - 1)*1e3), n_features=n_features, n_redundant=0,
                           n_classes=n_classes, n_informative=2, random_state=1, n_clusters_per_class=1)
Labels_dict = dict(zip(list(np.unique(y)), [num2words(x).title() for x in np.unique(y)]))

Data = pd.DataFrame(data = X, columns = ['Feature %i' % (i+1) for i in range(n_features)])
Target = 'Outcome Variable'
Data[Target] = y
display(Data)

from HD_DeepLearning import Plot_Data

PD = dict(BP = .5, alpha=.7, bg_alpha = 0.25, grid = True, cricle_size = 50,
          FigSize = 7, h=0.02, pad=1, ColorMap =  'tab10', Labels = list(Labels_dict.values()))

Plot_Data(X, y, PD = PD, Labels_dict = Labels_dict, ax = None)
      Feature 1  Feature 2  Outcome Variable
0     -1.689380   1.233636                 2
1     -1.357824   1.236826                 2
2      1.104089   1.308663                 3
3     -1.395940  -0.465010                 0
4     -0.221240  -2.495895                 0
...         ...        ...               ...
2995  -1.457264   1.179276                 2
2996   0.998499   1.209994                 3
2997  -0.301584  -2.114714                 0
2998   1.111228   0.011635                 3
2999  -1.316420  -0.448300                 0

3000 rows × 3 columns

Train and Test sets

In [2]:
Pull = [.01 for _ in range(len(Labels_dict) - 1)]
Pull.append(.1)

import plotly.express as px
from HD_DeepLearning import DatasetTargetDist
PD = dict(PieColors = px.colors.sequential.deep, TableColors = ['Navy','White'], hole = .4,
          column_widths=[0.6, 0.4], textfont = 14, height = 400, tablecolumnwidth = [0.25, 0.15, 0.15],
          pull = Pull, legend_title = Target, title_x = 0.5, title_y = .9, pie_legend = [0.01, 0.01])
del Pull
DatasetTargetDist(Data, Target, Labels_dict, PD, orientation= 'columns')

StratifiedShuffleSplit is a merge of StratifiedKFold and ShuffleSplit: it returns stratified randomized folds, where each set contains approximately the same percentage of samples of each target class as the complete set.
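The class-proportion claim above can be verified on a small example, separate from the dataset used in this notebook. The toy variable names (`y_toy`, `X_toy`) are illustrative:

```python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit

# Toy labels: 60% class 0, 40% class 1
y_toy = np.array([0]*60 + [1]*40)
X_toy = np.arange(100).reshape(-1, 1)

sss = StratifiedShuffleSplit(n_splits=1, test_size=0.3, random_state=0)
train_idx, test_idx = next(sss.split(X_toy, y_toy))

# Both splits preserve the 60/40 class ratio of the full set
train_ratio = np.bincount(y_toy[train_idx]) / len(train_idx)
test_ratio = np.bincount(y_toy[test_idx]) / len(test_idx)
print(train_ratio)  # -> [0.6 0.4]
print(test_ratio)   # -> [0.6 0.4]
```

With a plain ShuffleSplit, by contrast, the per-fold class ratios would only match the full set on average, not in every fold.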

In [3]:
from sklearn.model_selection import StratifiedShuffleSplit

Test_Size = 0.3
sss = StratifiedShuffleSplit(n_splits=1, test_size=Test_Size, random_state=42)
_ = sss.get_n_splits(X, y)
for train_index, test_index in sss.split(X, y):
    # X: use positional indexing, since split() yields positional indices
    if isinstance(X, pd.DataFrame):
        X_train, X_test = X.iloc[train_index], X.iloc[test_index]
    else:
        X_train, X_test = X[train_index], X[test_index]
    # y
    if isinstance(y, pd.Series):
        y_train, y_test = y.iloc[train_index], y.iloc[test_index]
    else:
        y_train, y_test = y[train_index], y[test_index]
del sss

from HD_DeepLearning import Train_Test_Dist  
PD.update(dict(column_widths=[0.3, 0.3, 0.3], tablecolumnwidth = [0.2, 0.4], height = 550, legend_title = Target))

Train_Test_Dist(X_train, y_train, X_test, y_test, PD, Labels_dict)
#
import tensorflow as tf
y_train = tf.keras.utils.to_categorical(y_train, num_classes=len(Labels_dict))
y_test = tf.keras.utils.to_categorical(y_test, num_classes=len(Labels_dict))

Modeling: Multi-layer Perceptron (MLP) for Multi-class Classification

A multi-layer perceptron (MLP) is a class of feedforward artificial neural network (ANN). scikit-learn.org has a well-written article on MLPs, and interested readers are encouraged to consult it.

In this article, we present a multi-class MLP and focus on implementing it in Keras. We define our model using the Sequential class and use the rectified linear unit (ReLU) as the activation function in the hidden layers; an activation function allows complex relationships in the data to be learned. For the last layer, we use the softmax function, also known as softargmax or the normalized exponential function, which converts the raw outputs into a probability distribution over the classes.
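To make the softmax step concrete, here is a minimal NumPy sketch (the logits are made-up numbers, not outputs of the model below):

```python
import numpy as np

def softmax(z):
    """Normalized exponential: exp(z_i) / sum_j exp(z_j)."""
    z = z - np.max(z)   # shifting by the max is standard for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1, -1.0])  # illustrative raw outputs for four classes
probs = softmax(logits)
print(probs.round(4))   # class probabilities
print(probs.sum())      # -> 1.0
print(probs.argmax())   # -> 0 (softmax preserves the argmax of the logits)
```

Because the outputs sum to one, the index of the largest probability can be read directly as the predicted class, which is how the predictions are decoded later with `argmax`.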

In [4]:
model = tf.keras.Sequential(name = 'Multi_Class_MLP')
model.add(tf.keras.layers.Dense(64, input_dim = X.shape[1], activation='relu', name='Layer1'))
model.add(tf.keras.layers.Dropout(0.5))
model.add(tf.keras.layers.Dense(64, activation='relu', name='Layer2'))
model.add(tf.keras.layers.Dropout(0.5))
model.add(tf.keras.layers.Dense(len(Labels_dict), activation='softmax', name='Layer3'))
model.summary()
tf.keras.utils.plot_model(model, show_shapes=True, show_layer_names=True, expand_nested = True, rankdir = 'LR')
Model: "Multi_Class_MLP"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 Layer1 (Dense)              (None, 64)                192       
                                                                 
 dropout (Dropout)           (None, 64)                0         
                                                                 
 Layer2 (Dense)              (None, 64)                4160      
                                                                 
 dropout_1 (Dropout)         (None, 64)                0         
                                                                 
 Layer3 (Dense)              (None, 4)                 260       
                                                                 
=================================================================
Total params: 4,612
Trainable params: 4,612
Non-trainable params: 0
_________________________________________________________________
Out[4]:
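The parameter counts in the summary can be verified by hand: a Dense layer with n inputs and m units has (n + 1) × m parameters (an n × m weight matrix plus one bias per unit), and Dropout layers add none. A quick check (the helper name `dense_params` is ours, not a Keras function):

```python
def dense_params(n_inputs, n_units):
    # weight matrix (n_inputs x n_units) plus one bias per unit
    return (n_inputs + 1) * n_units

layer1 = dense_params(2, 64)    # 2 input features  -> 192
layer2 = dense_params(64, 64)   # 64 -> 64          -> 4160
layer3 = dense_params(64, 4)    # 64 -> 4 classes   -> 260
total = layer1 + layer2 + layer3
print(total)                    # -> 4612, matching the summary
```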
In [5]:
# Number of epochs
IT = int(1e2)+1

model.compile(optimizer= tf.keras.optimizers.SGD(learning_rate=0.01, decay=1e-6, momentum=0.9, nesterov=True),
              loss='categorical_crossentropy', metrics=['accuracy', tf.keras.metrics.Recall()])

# Train model
history = model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs= IT, batch_size=128, verbose = 0)
In [6]:
def Search_List(Key, List): return [s for s in List if Key in s]

Metrics_Names = {'loss':'Loss', 'accuracy':'Accuracy', 'mae':'MAE', 'mse':'MSE', 'recall': 'Recall'}

from HD_DeepLearning import history2Table

Validation_Table = Search_List('val_',history.history.keys()) 
Train_Table = list(set( history.history.keys()) - set(Validation_Table))
Validation_Table = pd.DataFrame(np.array([history.history[x] for x in Validation_Table]).T, columns = Validation_Table)
Train_Table = pd.DataFrame(np.array([history.history[x] for x in Train_Table]).T, columns = Train_Table)
Validation_Table.columns = [x.replace('val_','') for x in Validation_Table.columns]

Train_Table = history2Table(Train_Table, Metrics_Names)
Validation_Table = history2Table(Validation_Table, Metrics_Names)

# Train Set Score
score = model.evaluate(X_train, y_train, batch_size=128, verbose = 0)
score = pd.DataFrame(score, index = model.metrics_names).T
score.index = ['Train Set Score']

# Validation Set Score
Temp = model.evaluate(X_test, y_test, batch_size=128, verbose = 0) 
Temp = pd.DataFrame(Temp, index = model.metrics_names).T
Temp.index = ['Validation Set Score']
score = pd.concat([score, Temp])
score.rename(columns= Metrics_Names, inplace = True)
score = score.reindex(sorted(score.columns), axis=1)
display(score.style.format(precision=4))
                      Accuracy    Loss  Recall
Train Set Score         0.9195  0.2762  0.9129
Validation Set Score    0.9156  0.2929  0.9078
In [7]:
from HD_DeepLearning import Plot_history
PD = dict(row_heights = [0.4, 0.6], lw = 1.5, font_size=12, height = 700, yLim = 1,
          th_line_color = 'Navy', th_fill_color='darkslategray', table_columnwidth = [0.4, 0.4, 0.4, 0.4],
          tc_line_color = 'Navy', tc_fill_color = None, title_x = 0.46, title_y = 0.94, tb_cell_heigh = 20,
          Number_Format = '%.4e')

Plot_history(Train_Table, PD, Title = 'Train Set', Table_Rows = 10, Colors = ['RoyalBlue', 'DarkGreen', 'Red'])
Plot_history(Validation_Table, PD, Title = 'Validation Set', Table_Rows = 10, Colors = ['RoyalBlue', 'DarkGreen', 'Red'])
In [8]:
from HD_DeepLearning import Plot_Classification
import matplotlib.pyplot as plt

PD = dict(BP = .5, alpha=.7, bg_alpha = 0.15, grid = False, cricle_size = 50,
          FigSize = 7, h=0.02, pad=1, ColorMap =  'Set1', Labels = list(Labels_dict.values()))

fig, ax = plt.subplots(1, 2, figsize=(16, 7))
# Train Set
Plot_Classification(model, X_train, y_train.argmax(axis = 1), PD = PD, ax = ax[0])
_ = ax[0].set_title('Train Set', fontsize = 16, weight='bold')
# Test Set
Plot_Classification(model, X_test, y_test.argmax(axis = 1), PD = PD, ax = ax[1])
_ = ax[1].set_title('Test Set', fontsize = 16, weight='bold')

Confusion Matrix

The confusion matrix allows for visualization of the performance of a classification algorithm. Note that, given the size of the dataset, we do not perform a cross-validation evaluation here, although in general that type of evaluation is preferred.
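The structure of a confusion matrix can be seen on a tiny made-up example (the labels below are illustrative, not drawn from the dataset above):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Illustrative true and predicted labels for a 3-class problem
y_true = np.array([0, 0, 1, 1, 2, 2, 2, 1])
y_hat  = np.array([0, 1, 1, 1, 2, 2, 0, 1])

cm = confusion_matrix(y_true, y_hat)
# Row i, column j counts samples of true class i predicted as class j,
# so the diagonal holds the correct predictions.
print(cm)
print(np.trace(cm) / cm.sum())   # overall accuracy: 6/8 = 0.75
```

Off-diagonal cells show which classes are confused with which, which is exactly the information an overall accuracy number hides.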

In [9]:
from sklearn import metrics

# Train
y_pred = model.predict(X_train).argmax(axis = 1)
Reports_Train = pd.DataFrame(metrics.classification_report(y_train.argmax(axis = 1),
                                                           y_pred, target_names=list(Labels_dict.values()),
                                                           output_dict=True)).T
CM_Train = metrics.confusion_matrix(y_train.argmax(axis = 1), y_pred)
# Test
y_pred = model.predict(X_test).argmax(axis = 1)
Reports_Test = pd.DataFrame(metrics.classification_report(y_test.argmax(axis = 1),
                                                          y_pred, target_names=list(Labels_dict.values()),
                                                          output_dict=True)).T
CM_Test = metrics.confusion_matrix(y_test.argmax(axis = 1), y_pred)

Reports_Train = Reports_Train.reset_index().rename(columns ={'index': 'Train Set'})
Reports_Test = Reports_Test.reset_index().rename(columns ={'index': 'Test Set'})
                                                 
display(Reports_Train.style.hide(axis='index').set_properties(**{'background-color': 'HoneyDew', 'color': 'Black'}).\
        set_properties(subset=['Train Set'], **{'background-color': 'SeaGreen', 'color': 'White'}))
display(Reports_Test.style.hide(axis='index').set_properties(**{'background-color': 'Azure', 'color': 'Black'}).\
        set_properties(subset=['Test Set'], **{'background-color': 'RoyalBlue', 'color': 'White'}))

from HD_DeepLearning import Confusion_Mat
PD = dict(FS = (14, 6), annot_kws = 14, shrink = .6, Labels = list(Labels_dict.values()))
Confusion_Mat(CM_Train, CM_Test, PD = PD, n_splits = None)
Train Set     precision    recall  f1-score      support
Zero           0.916817  0.969407  0.942379   523.000000
One            0.867424  0.869070  0.868246   527.000000
Two            0.967181  0.952471  0.959770   526.000000
Three          0.928144  0.887405  0.907317   524.000000
accuracy       0.919524  0.919524  0.919524     0.919524
macro avg      0.919892  0.919588  0.919428  2100.000000
weighted avg   0.919863  0.919524  0.919383  2100.000000

Test Set      precision    recall  f1-score     support
Zero           0.923729  0.973214  0.947826   224.000000
One            0.855263  0.862832  0.859031   226.000000
Two            0.963636  0.942222  0.952809   225.000000
Three          0.921296  0.884444  0.902494   225.000000
accuracy       0.915556  0.915556  0.915556     0.915556
macro avg      0.915981  0.915678  0.915540   900.000000
weighted avg   0.915905  0.915556  0.915441   900.000000